Updating the SVD

In many applications which are based on the SVD, arrival of new data requires SVD of the new matrix. Instead of computing from scratch, existing SVD can be updated.

Prerequisites

The reader should be familiar with concepts of singular values and singular vectors, related perturbation theory, and algorithms.

Competences

The reader should be able to recognise applications where SVD updating can be sucessfully applied and apply it.

Facts

For more details see M. Gu and S. C. Eisenstat, A Stable and Fast Algorithm for Updating the Singular Value Decomposition and M. Brand, Fast low-rank modifications of the thin singular value decomposition and the references therein.

  1. Let $A\in\mathbb{R}^{m\times n}$ with $m\geq n$ and $\mathop{\mathrm{rank}}(A)=n$, and let $A=U\Sigma V^T$ be its SVD. Let $a\in\mathbb{R}^{n}$ be a vector, and let $\tilde A=\begin{bmatrix} A \\ a^T\end{bmatrix}$. Then $$\begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} \begin{bmatrix} \Sigma \\ a^TV \end{bmatrix} V^T. $$ Let $\begin{bmatrix} \Sigma \\ a^T V \end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of the half-arrowhead matrix. This SVD can be computed in $O(n^2)$ operations. Then $$\begin{bmatrix} A \\ a^T\end{bmatrix} = \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T V^T \equiv \tilde U \bar \Sigma \tilde V^T $$ is the SVD of $\tilde A$.

  2. Direct computation of $\tilde U$ and $\tilde V$ requires $O(mn^2)$ and $O(n^3)$ operations. However, these multiplications can be performed using Fast Multipole Method. This is not (yet) implemented in Julia and is "not for the timid" (quote by Steven G. Johnson).

  3. If $m<n$ and $\mathop{\mathrm{rank}}(A)=n$, then $$ \begin{bmatrix} A \\ a^T\end{bmatrix} =\begin{bmatrix} U & \\ & 1 \end{bmatrix} \begin{bmatrix} \Sigma & 0 \\ a^T V & \beta\end{bmatrix} \begin{bmatrix} V^T \\ v^T \end{bmatrix}, $$ where $\beta=\sqrt{\|a\|_2^2-\|V^T a\|_2^2}$ and $v=(I-VV^T)a$. Notice that $V^Tv=0$ by construction. Let $\begin{bmatrix} \Sigma & 0 \\ a^T V & \beta\end{bmatrix} = \bar U \bar \Sigma \bar V^T$ be the SVD of the half-arrowhead matrix. Then $$\begin{bmatrix} A \\ a^T\end{bmatrix} = \begin{bmatrix} U & \\ & 1 \end{bmatrix} \bar U \bar\Sigma \bar V^T \begin{bmatrix} V^T \\ v^T \end{bmatrix} \equiv \tilde U \bar \Sigma \tilde V^T $$ is the SVD of $\tilde A$.

  4. Adding a column $a$ to $A$ is equivalent to adding a row $a^T$ to $A^T$.

  5. If $\mathop{\mathrm{rank}}(A)<\min\{m,n\}$ or if we are using SVD approximation of rank $r$, and if we want to keep the rank of the approximation (this is the common case in practice), then the formulas in Fact 1 hold approximately. More precisely, the updated rank $r$ approximation is not what we would get by computing the approximation of rank $r$ of the updated matrix, but is sufficient in many applications.

Example - Adding row to a tall matrix

If $m\geq n$, adding row does not increase the size of $\Sigma$.


In [1]:
# pkg> add Arrowhead#master
using Arrowhead, LinearAlgebra

In [2]:
function mySVDaddrow(svdA::SVD,a::Vector)
    # Create the transposed half-arrowhead
    m,r,n=size(svdA.U,1),length(svdA.S),size(svdA.V,1)
    T=typeof(a[1])
    b=svdA.Vt*a
    if m>=n || r<m
        M=HalfArrow(svdA.S,b)
    else
        β=sqrt(norm(a)^2-norm(b)^2)
        M=HalfArrow(svdA.S,[b;β])
    end
    # From Arrowhead package
    svdM,info=svd(M)
    # Return the updated SVD
    if m>=n || r<m
        return SVD([svdA.U zeros(T,m); zeros(T,1,r) one(T)]*svdM.V, 
            svdM.S, adjoint(svdA.V*svdM.U))
    else
        # Need one more row of svdA.V - v is an orthogonal projection
        v=a-svdA.V*b
        normalize!(v)
        return SVD([svdA.U zeros(T,m); zeros(T,1,r) one(T)]*svdM.V, 
            svdM.S, adjoint([svdA.V v]*svdM.U))
    end
end


Out[2]:
mySVDaddrow (generic function with 1 method)

In [3]:
methods(SVD)


Out[3]:
# 1 method for type constructor:

In [4]:
import Random
Random.seed!(421)
A=rand(10,6)
a=rand(6)


Out[4]:
6-element Array{Float64,1}:
 0.33696435480910214
 0.916644781291106
 0.83277664059846
 0.8448238239288268
 0.8866516008033594
 0.3443212111724143

In [5]:
svdA=svd(A)


Out[5]:
SVD{Float64,Float64,Array{Float64,2}}
U factor:
10×6 Array{Float64,2}:
 -0.303731  -0.160825    0.398238   -0.354682    -0.0983823   0.0874599
 -0.308381   0.324931   -0.185955    0.218783    -0.212191    0.131507
 -0.292885   0.37671     0.0219543   0.282351    -0.358107    0.416593
 -0.37317   -0.601881   -0.578251   -0.00175991   0.145133    0.0924109
 -0.243567   0.363586   -0.103267   -0.290673     0.442374   -0.0129275
 -0.356062  -0.382149    0.405435    0.358618    -0.279737   -0.268595
 -0.276087   0.0488246   0.359745    0.398301     0.710811    0.0543865
 -0.357789  -0.0756382   0.0482742  -0.409233    -0.0341779   0.470945
 -0.391551   0.24166    -0.345022    0.0810547   -0.0586923  -0.55629
 -0.20966    0.140166    0.214808   -0.448137    -0.110358   -0.431799
singular values:
6-element Array{Float64,1}:
 4.0443206608615325
 1.3339228353170136
 1.0423217489812764
 0.9192804665679337
 0.5577167598034549
 0.3174781822238693
Vt factor:
6×6 Array{Float64,2}:
 -0.430615  -0.536704  -0.406901   -0.37864    -0.288732   -0.366357
 -0.189618   0.125828   0.651036    0.156569   -0.222344   -0.671129
 -0.375426   0.548351   0.0473998  -0.325618   -0.562704    0.365319
  0.339005  -0.308096   0.517876   -0.700375    0.0352901   0.173744
 -0.715287  -0.218191   0.321263    0.0955378   0.459078    0.343027
  0.105415  -0.502889   0.192176    0.475856   -0.581862    0.366137

In [6]:
norm(A*svdA.V-svdA.U*Diagonal(svdA.S))


Out[6]:
7.973965710771358e-15

In [7]:
typeof(svdA)


Out[7]:
SVD{Float64,Float64,Array{Float64,2}}

In [8]:
svdAa=mySVDaddrow(svdA,a)


Out[8]:
SVD{Float64,Float64,Array{Float64,2}}
U factor:
11×6 Array{Float64,2}:
 -0.276987   0.203469    0.326481   -0.410555   0.145536     0.065527
 -0.285338  -0.271956   -0.0348243   0.330933   0.262804     0.0867921
 -0.268963  -0.297031    0.191951    0.357817   0.357062     0.257841
 -0.341629   0.573595   -0.599873    0.116416  -0.00903236   0.180141
 -0.231475  -0.360711   -0.095999   -0.225752  -0.152295     0.244164
 -0.319504   0.460633    0.435275    0.243196   0.0404618   -0.381818
 -0.254208   0.0039304   0.393724    0.246646  -0.656236     0.291795
 -0.32926    0.101391    0.0121052  -0.35485    0.236027     0.462658
 -0.364495  -0.21389    -0.234478    0.216498   0.139691    -0.419086
 -0.19529   -0.119094    0.166178   -0.440219   0.180764    -0.354866
 -0.391301  -0.234263   -0.243032   -0.206924  -0.467766    -0.288235
singular values:
6-element Array{Float64,1}:
 4.386608202014708
 1.3697132301104165
 1.0721964985331782
 0.9453326789483859
 0.6749415350129048
 0.35737264502281824
Vt factor:
6×6 Array{Float64,2}:
 -0.394539  -0.536976   -0.420871  -0.398104  -0.324469  -0.339258
  0.260977  -0.0788592  -0.604403  -0.209693   0.112295   0.70978
 -0.199578   0.492155    0.211532  -0.40749   -0.650133   0.290661
  0.519908  -0.404144    0.50087   -0.558732   0.055077   0.0166614
  0.678824   0.115075   -0.305614   0.213431  -0.51228   -0.352948
 -0.072745  -0.535394    0.261817   0.52322   -0.440423   0.414466

In [9]:
Aa=[A;transpose(a)]
println(size(Aa),size(svdAa.U),size(svdA.V))
[svdvals(Aa) svdAa.S]


(11, 6)(11, 6)(6, 6)
Out[9]:
6×2 Array{Float64,2}:
 4.38661   4.38661
 1.36971   1.36971
 1.0722    1.0722
 0.945333  0.945333
 0.674942  0.674942
 0.357373  0.357373

In [10]:
# Check the residual and orthogonality
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)


Out[10]:
(8.169456943601558e-15, 1.860153478103577e-15, 2.6152984582098762e-15)

Example - Adding row to a flat matrix


In [11]:
# Now flat matrix
Random.seed!(421)
A=rand(6,10)
a=rand(10)
svdA=svd(A);

In [12]:
A


Out[12]:
6×10 Array{Float64,2}:
 0.345443  0.17008   0.608612  0.766264   …  0.0579196  0.678276  0.0327699
 0.68487   0.525208  0.346561  0.810683      0.973456   0.114236  0.996598
 0.650991  0.785847  0.561248  0.198694      0.343372   0.680339  0.708267
 0.973053  0.135538  0.915812  0.854638      0.280957   0.104854  0.599061
 0.105135  0.958365  0.605095  0.905889      0.281784   0.130086  0.16948
 0.77247   0.560486  0.83639   0.0936446  …  0.302895   0.909776  0.124078

In [13]:
Aa


Out[13]:
11×6 Array{Float64,2}:
 0.345443  0.958365  0.198694   0.532695   0.115946   0.680339
 0.68487   0.560486  0.854638   0.470911   0.301274   0.104854
 0.650991  0.608612  0.905889   0.381798   0.0579196  0.130086
 0.973053  0.346561  0.0936446  0.664831   0.973456   0.909776
 0.105135  0.561248  0.651562   0.692733   0.343372   0.0327699
 0.77247   0.915812  0.37833    0.0414614  0.280957   0.996598
 0.17008   0.605095  0.834811   0.100532   0.281784   0.708267
 0.525208  0.83639   0.353274   0.848523   0.302895   0.599061
 0.785847  0.766264  0.831302   0.627814   0.678276   0.16948
 0.135538  0.810683  0.217897   0.494844   0.114236   0.124078
 0.336964  0.916645  0.832777   0.844824   0.886652   0.344321

In [14]:
svdAa=mySVDaddrow(svdA,a);

In [15]:
Aa=[A;transpose(a)]
println(size(Aa),size(svdAa.U),size(svdA.V))
[svdvals(Aa) svdAa.S]


(7, 10)(7, 7)(10, 6)
Out[15]:
7×2 Array{Float64,2}:
 4.44188   4.44188
 1.41235   1.41235
 1.2192    1.2192
 0.985345  0.985345
 0.49206   0.49206
 0.456045  0.456045
 0.26585   0.26585

In [16]:
# Check the residual and orthogonality
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)


Out[16]:
(9.773004839402197e-15, 1.6518631028362133e-15, 4.8346566025180166e-15)

Example - Adding columns

This can be viewed as adding rows to the transposed matrix, an elegant one-liner in Julia.


In [17]:
function mySVDaddcol(svdA::SVD,a::Vector)
    X=mySVDaddrow(SVD(svdA.V,svdA.S,adjoint(svdA.U)),a)
    SVD(X.V,X.S,adjoint(X.U))
end


Out[17]:
mySVDaddcol (generic function with 1 method)

In [21]:
# Tall matrix
Random.seed!(897)
A=rand(10,6)
a=rand(10)
svdA=svd(A)
svdAa=mySVDaddcol(svdA,a);

In [22]:
# Check the residual and orthogonality
Aa=[A a]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)


Out[22]:
(2.916534636777345e-15, 2.4065592454965646e-15, 1.5422239624507888e-15)

In [23]:
# Flat matrix
Random.seed!(332)
A=rand(6,10)
a=rand(6)
svdA=svd(A)
svdAa=mySVDaddcol(svdA,a);

In [24]:
# Check the residual and orthogonality
Aa=[A a]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)


Out[24]:
(3.3805679383519217e-15, 2.0590068407357257e-15, 1.9222177303623797e-15)

In [25]:
# Square matrix
A=rand(10,10)
a=rand(10)
svdA=svd(A);

In [26]:
svdAa=mySVDaddrow(svdA,a)
Aa=[A;transpose(a)]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)


Out[26]:
(5.617414775356018e-14, 2.7121157027868105e-15, 3.3018690692786864e-15)

In [27]:
svdAa=mySVDaddcol(svdA,a)
Aa=[A a]
norm(Aa*svdAa.V-svdAa.U*Diagonal(svdAa.S)),
 norm(svdAa.U'*svdAa.U-I), norm(svdAa.Vt*svdAa.V-I)


Out[27]:
(5.6791205710286555e-14, 2.951670586067795e-14, 4.682214349907592e-15)

Example - Updating a low rank approximation


In [28]:
# Adding row to a tall matrix
A=rand(10,6)
svdA=svd(A)
a=rand(6)
# Rank of the approximation
r=4


Out[28]:
4

In [29]:
svdAr=SVD(svdA.U[:,1:r], svdA.S[1:r],adjoint(svdA.V[:,1:r]));

In [30]:
# Eckart, Young, Mirsky
Ar=svdAr.U*Diagonal(svdAr.S)*svdAr.Vt
Δ=Ar-A
opnorm(Δ),svdvals(A)[5]


Out[30]:
(0.7079785575104238, 0.7079785575104239)

In [31]:
svdAa=mySVDaddrow(svdAr,a);

In [32]:
Aa=[A; transpose(a)];

In [33]:
svdvals(Aa),svdvals([Ar;transpose(a)]),svdAa.S


Out[33]:
([3.974688600527959, 1.2989639055449105, 1.1593964227176732, 0.9516010740072087, 0.7363080325810998, 0.43136995644832393], [3.9746532247711968, 1.2936632729078836, 1.1593893307986551, 0.9421370661670921, 0.3428502857666232, 3.2872318967377147e-16], [3.9730362497396428, 1.2759865368889969, 1.1593698461048871, 0.9309428905977566])

In [34]:
# Adding row to a flat matrix
A=rand(6,10)
svdA=svd(A)
a=rand(10)
# Rank of the approximation
r=4


Out[34]:
4

In [35]:
svdAr=SVD(svdA.U[:,1:r], svdA.S[1:r],adjoint(svdA.V[:,1:r]))
svdAa=mySVDaddrow(svdAr,a);

In [36]:
Ar=svdAr.U*Diagonal(svdAr.S)*svdAr.Vt
svdvals(Aa),svdvals([Ar;transpose(a)]),svdAa.S


Out[36]:
([3.974688600527959, 1.2989639055449105, 1.1593964227176732, 0.9516010740072087, 0.7363080325810998, 0.43136995644832393], [4.549675903281804, 1.4400409837703734, 1.2308806688298173, 0.9136798177037393, 0.5866969846783837, 3.5919311192783594e-16, 1.7020661083452836e-16], [4.546959853941822, 1.4377919818811713, 1.2272341035372212, 0.8942604551872634])